... simulation enforces rigor in theory specification is analyzed; the
... source of new ideas about cognitive processing mechanisms, with
... (HPM) program, the Collaborative Activation-based Production System
(CAPS), and the READER model. Psychological simulation languages are
then discussed, as are aspects of programming environments which
facilitate simulation work. In closing, a new simulation language,
the Program for Research into Self-Modifying Systems (PRISM), is
described in detail. Eight figures and a 50-item reference list ...
Abstract



Three views of the function of computer simulation in cognitive
psychology are analyzed. The strong view that computer simulations will
produce more rigorously specified theories is seen to be overstating the
case. Two more pragmatic views are supported. One looks at computer
method as a means of exploring or validating psychological theories.
The other looks to computer simulation as a source of useful concepts.
Several recent simulation efforts are presented as illustrations of
these latter views. After establishing some perspective on the uses of
simulation, the discussion turns to psychological simulation languages,
and to aspects of programming environments which facilitate simulation
work. A new simulation language, PRISM, is described. PRISM's design
is intended as a response to some of the issues raised in this paper.
1.0 OVERVIEW

Although the primary purpose of this paper is to discuss simulation
systems, how we view simulation as a methodology strongly affects our
perceptions of what constitutes a useful simulation system. Therefore,
the first part of this discussion considers several common views of the
role of simulation in cognitive psychology. In the process of
discussing each of these views, I will be making some assertions about
useful principles of simulation, and reviewing instances of simulation
work which illustrate those principles. Once some perspective is
established regarding simulation's uses, I will turn to a discussion of
where I believe simulation work is heading. That discussion will
consider the rise and fall of some past psychological simulation
languages, as a means of focusing attention on aspects of programming
environments that facilitate simulation work in general.

Finally, I will close with a discussion of a particular class of
psychological simulation languages, production systems. That discussion
will focus on the design of a new production system language called
PRISM, which is being developed in collaboration with Pat Langley of
Carnegie-Mellon University (Langley & Neches, 1981).

2.0 SIMULATION AS POLICEMAN OF THEORETICAL RIGOR



I'd like to start by exorcising a ghost, in the form of an extreme
argument for simulation that was propounded rather vigorously in the
late 1960's and early 1970's. This was the claim that computer
simulation was a superior formalism for enforcing greater rigor in
theory specification.

A strong example of this particular argument appears in Gregg & Simon's
(1967) article using concept formation as a demonstration domain for
information processing models. Embedded in that article were five
claims for the advantages of requiring that running computer programs be
associated with psychological theories:



- Inconsistencies would be prevented by the need to specify a
  particular set of operations in order to implement a
  hypothesized psychological process; the same set of operations
  would have to suffice for all cases in which that process was
  evoked.

- Untested implicit assumptions would be rendered impossible by
  the need to specify a complete set of processes. A program
  which does not specify processes completely could not run.

- Overly flexible theories which could too easily fit data would
  be prevented by the fact that computer programs contain no
  numerical parameters.

- Untenable theories would be eliminated by virtue of the
  specific sequence of operations generated by a program, which
  could be treated as predictions about intermediate processes.
  These predictions could be compared against process tracing
  data, such as verbal protocols or eye movements, thus allowing
  much more specific tests of a model (1).

- The need for a program to operate upon specific data would
  prevent finessing critical questions about encoding and
  representation.

There are some positive examples supporting these claims. John
Anderson, one cognitive psychologist clearly influenced by the
simulation approach (Anderson, 1976), has produced a very detailed
theory which is often relatively specific in its claims. His work has
stimulated a number of studies, both supporting and opposing.

However, in spite of positive examples such as his, it is hard to
say that simulation was the causal factor in the development of a
detailed model. Certainly the history of psychology contains a number
of comprehensive theories not cast in a computational formalism.




Footnote 1: This, and the preceding point, is particularly important if
one adopts Popper's (1959) view of science. Popper suggested that the
dominant goal is to refute theories rather than support them, with a
theory being "accepted" only so long as no evidence can be found counter
to it. In that view, a theory is best if it is highly specific and
therefore amenable to disconfirmation. In that case, either the cause
for its disconfirmation leads to a new and better theory, or the failure
to disconfirm lends credence to it.




2.2 Six Problems With The Five Claims

Furthermore, experience with simulation since the early days of Gregg &
Simon (1967) has shown that there are a number of ways to avoid rigor
while doing simulation work:

- A formal specification of a model needn't imply a comprehensible
  presentation; since programs are rarely presented in full with
  accompanying documentation, we remain dependent on verbal
  descriptions of the model. This can raise problems in determining
  whether the program performs as it does for the reasons claimed by
  its author. For example, see Hanna & Ritchie's (undated) analysis
  of Lenat's (1976, 1977) AM program, a system which has received a
  great deal of attention in the Artificial Intelligence community
  for its apparent ability to re-discover a number of interesting
  mathematical theorems. Hanna and Ritchie identify several points
  which contribute to its performance, but at which the actual program
  appears inconsistent with the general principles Lenat presented.
  They also raise instantiations of four of the five potential
  problems listed below.

- Programs frequently involve simplifying assumptions in order to
  facilitate implementation. These simplifications, however, cause
  the program to diverge from the theory it supposedly represents.

- Programs can be written to work only for a restricted set of
  examples, those presented in the write-up of the research. In the
  absence of some analysis of the formal properties of the domain,
  there is no automatic guarantee that the examples presented are
  representative of the domain, or that the principles required to
  handle a given set of examples are sufficient to account for the
  entire domain.



- The inputs or database for the program can be structured in ways
  that simplify its task, but which are not necessarily
  psychologically plausible. That is, the real work of performing a
  task may be done before the program is started.

- Data or procedures supplied to the program to define different
  examples for it to handle may, in fact, constitute non-numerical
  parameters that give the program considerable flexibility in
  fitting psychological data. Newell & Simon (1972, page 56), for
  example, admit that the operators and table of differences supplied
  to GPS constitute such parameters.

- The programmer may hold back data or procedures that would have
  confused the program had they been available. That is, the program
  may appear to perform well not because it has the capacity to
  choose the correct action from all possibilities, but rather
  because the difficult choices are not offered to it.

For all the above reasons, there is no immediate assurance that a
program's consistency with psychological data means the program is of
psychological significance. Nor, on the other hand, is an
inconsistency necessarily a sign of failure. For example, Newell &
Simon (1972, page 472) admit to a number of exceptions to GPS' account
of protocols obtained from subjects solving logic problems.

Although Newell and Simon are fond of claiming that the test of a
theory is a running program, this is no more true than claiming that
the true test of an experiment's validity is a 0.05 significance level.
The real question is how and why a particular result was obtained. The
claim that computer simulation will necessarily lead to clearer and
more rigorous psychological models does not hold up.



It is perhaps better seen from a historical perspective, as an
argument stemming partly from the days of simpler programs, but
primarily from a need to make a case for the respectability of
simulation methodology compared to established mathematical modelling
and experimental approaches. Unfortunately, the proponents of
simulation approaches have, if anything, damaged the credibility of
their case by overstating it.

3.0 SIMULATION AS A METHOD OF EXPLORING OR VALIDATING THEORIES

Therefore, I'd like to turn to some less ambitious views of
simulation, in which a computer implementation is viewed not as a
necessary formalism for expressing a model, but rather as simply one of
several means for gathering information about it. Even this more
restricted view may still be controversial.

3.1 The Significance Of A Running Program

One of the issues in the controversy is the significance of the fact
that a program runs. L. Miller (1978) does a very nice job of
summarising the debate, which he suggests stems from alternative
assumptions about the difficulty of theory validation. One side, he
claims, believes that theories are easy to generate but difficult to
test. The other believes that a good theory is a significant and
difficult accomplishment, and is accordingly more impressed by a
demonstration of a model's sufficiency through the successful
implementation of a computer program.

A related question has to do with the ultimate discriminability of
psychological models. Anderson [year illegible], for example, has
claimed that many different models can produce empirically identical
predictions, and has even gone so far as to suggest that it is futile
to try to distinguish which alternative is correct by experimental
methods. Naturally, this claim has been disputed. Hayes-Roth (1979) has
offered one of the more detailed responses, basically arguing that if
two sets of processes are not identical, then it should be possible to
find some form of process tracing data for which the two sets make
different predictions. Without taking a firm position on the ultimate
resolution to these questions, we still can say that simulation gives a
means of exploring the plausibility of models where theoretical
sophistication exceeds the state of the art in empirical testing.

In such cases, there are a number of ways that modelling can aid
our thinking. The demonstration that a theory is sufficiently powerful
to guide implementation of a working program is certainly encouraging
for its credibility. Efforts to produce working programs can also lead
to a better understanding of the computational requirements of a task,
which in turn can help to constrain the set of plausible theories.

3.2 Empirical Analyses Of Programs

Another important contribution of simulation comes from our greater
freedom to perform psycho-surgery on a program, since no clearance from
a Human Subjects Committee is required in order to modify a computer
simulation. This permits use of simulation for experiments that would
be unethical or impossible with human subjects, experiments that can
help in understanding the interactions between components in complex
models. I'd like to offer McClelland & Rumelhart's (1981) model of
word perception as an interesting example of this.



McClelland & Rumelhart (1981) were concerned with explaining a
number of phenomena in the perception of words and letters in
tachistoscopically presented displays. Among their key concerns were:
(a) modelling the process of recognising words and letters within
words; (b) explaining the facilitating effect of pseudo-words for
letter recognition; (c) explaining the sensitivity of the pseudo-word
effect to expectations about what will be presented; and (d)
explaining the differential effects of various kinds of masks.

The model which they built assumed a highly-linked structure of
nodes, representing hypotheses at various levels about what stimulus
was presented. An example of such a structure is illustrated in Figure
1. Each node has an activation level associated with it, which
represents the model's confidence at the current time in the hypothesis
represented by the node. Hypothesis nodes vary in their baseline
activation level.



Each node has a large number of weighted links to other hypothesis
nodes. Excitatory links send activation to hypotheses consistent with
a node. Inhibitory links decrease activation of inconsistent
hypotheses.

The activation of a node at any point in time is a function of its
baseline activation and the excitatory and inhibitory activation
received from related hypothesis nodes. The function used modulated
the activation level to keep it within a restricted range and allow for
time decay. Activation reverberates through the network, and at some
point in time whichever hypothesis is most active at that point is
accepted as true.
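The update just described can be sketched in code. The following
Python fragment is only an illustration, assuming a bounded update
with decay toward a resting (baseline) level. The function names, the
parameter values (a_min, a_max, decay), and the exact form of the
update are assumptions made for this sketch, not equations or values
given in this paper.

```python
def net_input(activations, weights):
    """Sum weighted input from neighbours that are currently active.

    `weights` maps neighbour name -> link weight (hypothetical layout):
    positive weights are excitatory links, negative ones inhibitory.
    Only neighbours with activation above zero send anything.
    """
    return sum(w * max(0.0, activations[j]) for j, w in weights.items())


def update_activation(a, net, rest, a_min=-0.2, a_max=1.0, decay=0.07):
    """Advance one node by one time step (illustrative rule only).

    Excitation is scaled by the remaining headroom below the ceiling,
    inhibition by the distance above the floor, and the node decays
    toward its resting (baseline) level `rest`.
    """
    if net > 0:
        effect = net * (a_max - a)
    else:
        effect = net * (a - a_min)
    new_a = a - decay * (a - rest) + effect
    # Clamp as a safeguard so the value stays in the restricted range.
    return min(a_max, max(a_min, new_a))


def most_active(activations):
    """Return the hypothesis with the highest activation."""
    return max(activations, key=activations.get)
```

Under these assumptions a node driven by constant excitation settles
below the ceiling, a node driven by constant inhibition settles above
the floor, and a node receiving no input drifts back to its baseline.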



In this model, the word superiority effect and the facilitating
effect of words on letter recognition were explained in terms of
activation flows to and from nodes at the word hypothesis level. The
facilitating effect of pseudo-words on letter recognition could be
understood as an outcome of partially activated word hypotheses
reinforcing the letters. For example, the pseudo-word "TROP" contains
letters which would activate hypotheses such as "TRIP", "TRAP", and
"PROP"; these, in turn, would feed activation back to the hypotheses
for the letters "T", "R", "O", and "P". Finally, the effects of
various kinds of masks were explained in terms of the relative times at
which activation for the mask rose to levels sufficient to interfere
with activation for a target.

This last effect deserves discussion in some more detail, because
it nicely illustrates some of the advantages obtained through computer
modelling. The general phenomenon which McClelland and Rumelhart tried
to capture was as follows. When a tachistoscopically presented target
display is followed closely by presentation of a mask display, a number
of factors affect the extent to which the mask will interfere with
recognition of the target. The basic findings of interest involve
comparing letter and word recognition for three different kinds of
masks: feature masks consisting of letter-like geometrical shapes,
letter masks consisting of non-word letter strings, and word masks. A
number of studies have shown that letter recognition is about equally
affected by all three kinds of masks, while word recognition is
markedly less affected by feature masks than by letter or word masks.

Given the formulation of their model, the uniform effects of the
three different kinds of masks on letter recognition are easily
understood. All three kinds of masks quickly engender competing
hypotheses at the letter level. These can depress the correct
hypothesis' activation through their inhibitory links before that
hypothesis can reach its peak activation level.

In the case of word recognition, the difference in effects between
feature masks and others is somewhat more complicated to understand.
McClelland & Rumelhart, in spite of a long and fairly detailed
discussion of their model, do not make it clear what produces the
desired effect. (This is worth noting, in the light of Gregg & Simon's
claims that computer simulation would eliminate exactly this kind of
uncertainty.)

It appears their explanation is that random feature displays
weakly activate many different letter hypotheses, rather than strongly
activating a few. Thus, none of the competing alternatives have enough
strength for their inhibitory links to have an immediate effect on the
activation for the correct hypothesis. One indication that this is
indeed the intended explanation comes from their report that the
program was very sensitive to the degree of similarity between features
in the mask and the target.

This is an interesting point, because we see here that the program
is perhaps just as complex for an outsider to understand as a verbally
stated model. However, there are some real differences in the value of
a program over a verbal model in situations where the complexity of a
theory obscures its implications. With the program, unlike a
verbally expressed theory, it is possible to perform manipulations to
help understand exactly what factors contribute to its performance.
For example, having determined that the program was sensitive to
similarities at the feature level, McClelland and Rumelhart set out to
equate their stimuli in order to eliminate that confounding factor.
Doing that required coming up with feature, letter, and word masks
which all had just as many features the same as, and different from,
the target display. Worse yet, to properly equate the stimuli, the
equivalences had to hold letter-by-letter, for each letter position in
a four-character string.

This would be a rather daunting task if the stimuli had to be
created for human subjects in an experimental design of any statistical
rigor. It is difficult to create even one grouping of a target word
and three masks which would satisfy these criteria. Fortunately, in
evaluating the performance of the program, one is all that is needed.
Since the program is a deterministic entity, there is no concern of
statistical error. When running experiments with a program, the only
concern is with finding a range of inputs that verify the generality of
the results. The need to be concerned with noise, or the statistical
reliability of measurements of the program's performance, is
eliminated.

Even with the statistical issue of noise eliminated, though, it is
still difficult to construct stimuli in this particular case.
McClelland & Rumelhart's ability to do so illustrates yet another
virtue of models implemented as running programs, the ability to turn
thought-experiments into real tests of a theory. To create stimuli
meeting the desired criteria, they simply modified the knowledge base
of their program. For example, they selected as a target string the
word "MOLD". As a letter mask, they selected the string "ARAT". In
the specialized character font used in the experiments simulated, the
letters of "ARAT" and the letters of "MOLD" had, respectively, 2
similar features in the first letter position, 3 in the second, 1 in
the third, and 2 in the fourth.
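The position-by-position comparison described above is easy to state
as code. The sketch below assumes that each character in a font can be
represented as a set of feature identifiers. The toy font, the function
names, and the feature numbering are all hypothetical, and the font is
not the one used in the simulated experiments, so the counts it
produces are not those reported for "MOLD" and "ARAT".

```python
# Hypothetical four-character font: each character is a set of the
# stroke features it contains. Invented purely for illustration.
TOY_FONT = {
    "A": {1, 2, 3},
    "B": {2, 3, 4},
    "C": {1, 4},
    "D": {3, 4, 5},
}


def similarity_profile(target, mask, font):
    """Count shared features at each character position."""
    if len(target) != len(mask):
        raise ValueError("target and mask must have the same length")
    return [len(font[t] & font[m]) for t, m in zip(target, mask)]


def equated(target, masks, font):
    """True if every mask has the same per-position profile."""
    profiles = {tuple(similarity_profile(target, m, font)) for m in masks}
    return len(profiles) <= 1
```

With this toy font, the masks "CD" and "DA" both share one feature
with the first character of the target "AB" and two with the second, so
they count as equated; "DC" shares only one feature in each position
and does not.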

It was easy to produce a feature string with the same number of
similarities to the target string "MOLD". Where the constraints upon
the stimuli become tricky is in finding a common four-letter word which
also has the same pattern of similarities. However, because a program
can be much more easily modified than a human mind, McClelland &
Rumelhart were able to sidestep the constraint. After obtaining the
results of running their program with the letter string "ARAT" used as
the mask for "MOLD", they simply modified the program's database so
that "ARAT" was now represented as a known word. When they then ran
the program again, the results of the new run could be interpreted as
representing a word mask rather than a letter mask. Thus, they were
able to explore the effect of top-down knowledge about words without
the confounding effects of feature differences due to different letter
strings.

To see where some of those confounding effects could be produced,
and to see another virtue of analyzing the performance of a computer
model, we need to consider some other observations made by McClelland &
Rumelhart.

Since programs can be modified at any point, it is possible to
insert code to record virtually any kind of data about their run-time
characteristics. This can permit one to make observations about
implications of a model which might not come out nearly as clearly
otherwise. For example, tracing the time course of activation flow
enabled McClelland & Rumelhart to analyze three different factors
influencing activation level.

The first they called the "friends and enemies effect".
Activation is clearly going to depend on the number of excitatory and
inhibitory links from other active nodes. Thus, the likelihood of a
hypothesis being accepted, whether correct or not, is partly dependent
on the relative amount of knowledge which the system has stored about
it.

The second effect they called the "rich get richer" effect, the
empirical observation that feedback loops inherent to the structure
greatly accentuate over time any initial differences in baseline
activation levels. This is one of several aspects of the model which
offer accounts of expectation effects. In particular, by making
baseline activation encode word frequency, they were able to simulate
common frequency effects. Figure 2 illustrates this, by showing how
small initial differences in activation due to differing frequency were
enhanced over time for three alternative hypotheses entertained by the
program when presented with the string "MAVE". Note that all three
hypotheses have three letters in common with the presentation string,
and thus all receive equal bottom-up support.

Figure 2
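A small experiment of this kind can be sketched as follows. The code
assumes three competing word hypotheses that receive identical
bottom-up input, inhibit one another, and differ only in resting
level, and it records the full time course so that the trace can be
inspected afterwards. The node names, resting levels, and parameter
values are invented for the illustration; they are not those of the
program discussed here, and the update rule is an assumed bounded
update with decay.

```python
# Hypothetical resting levels standing in for word frequency.
RESTING = {"high_freq": 0.0, "mid_freq": -0.05, "low_freq": -0.10}


def simulate(rest, bottom_up=0.05, inhibit=0.2, decay=0.05,
             steps=40, a_min=-0.2, a_max=1.0):
    """Run the competition and return a list of activation snapshots."""
    acts = dict(rest)              # every node starts at its baseline
    trace = [dict(acts)]           # snapshot 0 is the initial state
    for _ in range(steps):
        new = {}
        for name, a in acts.items():
            # Only rivals with positive activation send inhibition.
            rivals = sum(max(0.0, v) for k, v in acts.items() if k != name)
            net = bottom_up - inhibit * rivals
            if net > 0:
                effect = net * (a_max - a)
            else:
                effect = net * (a - a_min)
            updated = a - decay * (a - rest[name]) + effect
            new[name] = min(a_max, max(a_min, updated))
        acts = new
        trace.append(dict(acts))
    return trace


def gap(snapshot):
    """Difference between the strongest and weakest hypothesis."""
    return max(snapshot.values()) - min(snapshot.values())
```

Comparing gap(trace[0]) with gap(trace[-1]) for trace =
simulate(RESTING) shows the initial difference in baseline widening
over time under these assumptions, even though the bottom-up input to
all three nodes is the same.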

The third effect was called the "gang effect". Observation of the
program showed that strong hypotheses at a given level indirectly
reinforced a subset of their competitors at the same level, those that
depended on the same supporting evidence. This is because a hypothesis
node sends activation to lower-level nodes, which in turn send
increased activation not only back to that node, but also to all other
higher-level nodes to which they are linked. Figure 3, for example,
shows how three additional hypotheses fare over time in response to the
same presentation string, "MAVE". Once again, all three alternatives
have three letters out of four in common with the string actually
presented, and so start out with equal initial bottom-up activation.
However, "SAVE" indirectly receives activation from five other word
hypotheses that boost the activation of the letters "A", "V", and "E"
(e.g., "HAVE" and "GAVE"). Similarly, the program had stored five
other words involving the letters "M", "A", and "E", and those
alternative word hypotheses boosted the activation levels for "MALE" by
way of those three shared letter hypotheses. On the other hand, there
were no other hypotheses involving "M", "V", and "E" to indirectly
support the hypothesis that the word seen was "MOVE". Thus, its
activation is markedly lower than for the other alternatives.

Figure 3

3.3 There Are No Simple Standards 


It is interesting to note that this simulation does not at all fulfill
the promised advantages of simulation outlined by Gregg & Simon (1967),
but instead illustrates the objections to their claims outlined in
section 2.2. We were promised specificity through parameter-free
models; McClelland and Rumelhart present a full-page table listing
parameters, and vary the settings in simulating different experiments.
We were promised deeper concern with encoding and representation; they
present a system which pre-codes information about letter position (and
which requires creating such a large number of links for exciting
consistent hypotheses and inhibiting inconsistent alternatives that one
has to wonder about the psychological processes required to add a new
piece of knowledge). Finally, we were promised extensibility to
related tasks; they presented a program which could not even easily be
modified to handle five-letter words.

However, these objections really do injustice to what we
instinctively know is a respectable piece of work. The problem is with
the standards offered by Gregg & Simon, which basically amount to a
promise that we will never again have to think hard to understand or
evaluate someone else's work. Those standards do not fully capture
what can be gained by simulation.

McClelland & Rumelhart's observations about interactions between
components of the model are significant because of their implications
for other work, a point which I'll return to below. What is of
interest for the moment, though, is that the ability to perform
empirical analyses of a program has enabled them to provide greater
insight into the implications of their model. In addition to
information about how well the model accounts for a body of data, the
capacity to perform experiments and make observations on a program
means that we can also get information about why the model succeeds or
fails.




4.0 SIMULATION AS A SOURCE OF NEW IDEAS 



Another view of simulation is as a source of new ideas about cognitive
processing mechanisms, which implies a close partnership between
cognitive psychology and artificial intelligence. Psychology, in spite
of recent claims to the contrary, has made several contributions to AI.
Among them are the notions of means-ends analysis embodied in GPS
(Ernst & Newell, 1969; Newell & Simon, 1972), of discrimination nets
(Feigenbaum, 1961; Simon & Feigenbaum, 1964), and of various semantic
network representations (e.g., Kintsch, 1974; Norman & Rumelhart,
1973; Anderson, 1976).

Psychology has certainly been influenced by AI. Winograd's (1972)
SHRDLU, for example, was considered of sufficient importance to have an
entire issue of Cognitive Psychology devoted to it. Another important,
although perhaps not as well-known, example is the HEARSAY speech
understanding system (Erman & Lesser, 1975). That system introduced
notions of a central memory structure shared by co-operating parallel
knowledge sources; these notions have influenced psychologists in
topics ranging from models of reading processes (Rumelhart, 1977) to
planning (Hayes-Roth & Hayes-Roth, 1979). Scripts (Schank & Abelson,
1977), frames (Minsky, 1975), or schemata (Bobrow & Norman, 1975) have
generated a number of lines of research, as has the work on story
grammars (Rumelhart, 1975; Mandler, 1977; Thorndyke, 1977).
Although the examples just mentioned are all cases where ideas
about processes have been transferred fairly directly, simulation work
can have a much more subtle impact on psychological thinking. This is
because solutions to sub-problems encountered in the course of
implementing a program can turn out to have implications for
psychological issues that the program was not originally intended to
address. Often, this can help us gain a teleological understanding of
mechanisms, by making us aware of constraints that necessitate their
existence or force them to operate in a particular way.



All computer programs are fundamentally concerned with issues of
control and focus of attention (or, to put it less elegantly, getting
the right things done at the right time). Thus, the process of
developing a simulation can suggest domain-independent mechanisms which
other researchers can apply in developing models of behavior in quite
different topic areas.

To illustrate these rather abstract claims, I will first discuss
some of my own work on a learning simulation called HPM, then describe
a simulation of eye fixations in reading (Thibadeau, Just, & Carpenter,
1981), and briefly return to McClelland & Rumelhart's (1981) word
perception model. I will try to show how these disparate systems
contribute to a model of sloppy errors in algebra problem-solving.



4.1 HPM: An Example Of A Spin-off Discovery 



The HPM (for Heuristic Procedure Modification) program is a model of
learning through the incremental refinement of procedures (Neches,
1981a, 1981b). Although primarily concerned with learning, it turns
out to provide a new explanation for an old observation from the days
of gestalt psychology called the Zeigarnik effect. (For an English
description of this effect, see Lewin, 1935, pages 243-2[?]7.) The
effect, which Gestaltists interpreted as illustrating the phenomenon of
"closure", boils down to the observation that delayed recalls of a task
are richer and more detailed when subjects were stopped part-way
through the task than when they were allowed to carry the task through
to completion.

In order to make clear HPM's account of this phenomenon, it is
necessary to provide some background about the program. HPM is a
production system, which means that it belongs to the class of
programming languages in which procedures are specified as a set of
condition-action rules and data is represented as propositions in a
working memory. The system runs through a cycle of finding the set of
productions whose conditions are satisfied by the current contents of
working memory, selecting a subset of those rules for execution, and
modifying the contents of working memory according to the actions
specified by the rules selected for execution.
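
The cycle just described can be sketched in a few lines of Python. This
is only an illustration under simplifying assumptions, not HPM's actual
code: propositions are plain strings, a production's conditions are a
set of strings that must all be present, actions only add propositions,
and the names Production, recognize_act_cycle, and select are invented
for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set


@dataclass(frozen=True)
class Production:
    name: str
    conditions: FrozenSet[str]                # all must be in working memory
    action: Callable[[Set[str]], Set[str]]    # returns propositions to add


def recognize_act_cycle(wm: Set[str], productions: List[Production],
                        select: Callable[[List[Production]], List[Production]]) -> List[str]:
    """Run one cycle on `wm` (modified in place); return names of fired rules."""
    # 1. Recognize: find productions whose conditions are all satisfied.
    conflict_set = [p for p in productions if p.conditions <= wm]
    # 2. Select a subset of the matched rules for execution.
    chosen = select(conflict_set)
    # 3. Act: gather additions first, so every chosen rule sees the same memory.
    additions: Set[str] = set()
    for p in chosen:
        additions |= p.action(wm)
    wm |= additions
    return [p.name for p in chosen]


# Hypothetical rules, for illustration only.
rules = [
    Production("start-addition", frozenset({"goal: add"}),
               lambda wm: {"goal: count out first addend"}),
    Production("start-subtraction", frozenset({"goal: subtract"}),
               lambda wm: {"goal: count out minuend"}),
]
working_memory = {"goal: add"}
fired = recognize_act_cycle(working_memory, rules, select=lambda matched: matched)
print(fired, sorted(working_memory))
```

Passing the selection policy in as a function keeps the matching and
acting steps independent of any particular choice of which matched
rules are allowed to fire.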

The program was inspired by protocol studies by myself (Neches,
1981b) and others (e.g., Anzai & Simon, 1979) indicating that people
use a number of common-sense heuristics to improve their procedures on
the basis of experience applying them to a task. Most of the
simulation work has concentrated on getting the system to acquire an
addition strategy similar to that used by many second-graders, given a
simpler strategy employed by most pre-schoolers. Figure 4 shows the
heuristics which seem to be most relevant to this task, along with a
sequence of strategies that the system discovers. The initial strategy
adds two numbers by counting out a set of objects corresponding to each
addend, combining those two sets, and counting the total set. The
final strategy adds the numbers by incrementing the larger addend a
number of times given by the smaller addend.
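
To make the difference between the initial and final strategies
concrete, the following Python sketch implements both and counts the
elementary counting steps each one takes. The function names and the
step-counting convention (one step per object counted or per increment)
are assumptions made for this illustration; the intermediate strategies
that HPM discovers along the way are not shown.

```python
def count_all(a, b):
    """Initial strategy: count out each addend, combine the sets, count the total."""
    steps = 0
    first = []
    for _ in range(a):            # count out objects for the first addend
        first.append("object")
        steps += 1
    second = []
    for _ in range(b):            # count out objects for the second addend
        second.append("object")
        steps += 1
    combined = first + second     # combine the two sets
    total = 0
    for _ in combined:            # count the total set
        total += 1
        steps += 1
    return total, steps


def count_on_from_larger(a, b):
    """Final strategy: start at the larger addend, increment once per unit of the smaller."""
    larger, smaller = max(a, b), min(a, b)
    total, steps = larger, 0
    for _ in range(smaller):
        total += 1
        steps += 1
    return total, steps


print(count_all(2, 5))             # (7, 14)
print(count_on_from_larger(2, 5))  # (7, 2)
```

Both functions assume non-negative integer addends.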



HPM was designed as a vehicle for exploring the problem of
operationalizing heuristics such as those in Figure 4. Thus, the kinds
of questions I was concerned with were ones like, "What sort of
information about a procedure is necessary in order to apply heuristics
like these?"

The answer embodied in HPM involves solving problems by setting up
a hierarchical goal structure not unlike Sacerdoti's (1977) planning
nets. Productions in HPM respond to nodes in a partially-constructed
goal structure by adding propositions that further elaborate the goal
structure. Whenever a production fires, a linkage is established
between the propositions which satisfied its conditions (i.e., caused
its firing), and the propositions which were added as its actions.
This information allows HPM to implement heuristics like those of
Figure 4 as sets of productions which look for configurations in goal
structures indicative of inefficiencies. The program represents
learning by using the information to construct new productions, with
conditions that cause them to fire in circumstances when the
inefficiency is likely to be repeated. The information allows the
productions to construct actions for the new productions that cause the
system to sidestep the inefficiency.

Figure 5 illustrates the structures in HPM's memory after
executing its first production for addition in response to an
externally supplied goal to add two numbers. When we remember that the
semantic network shown in this figure represents only knowledge about
the first of a large number of steps to be taken, it is easy to see
that a huge body of information must be retained in order for the
system to represent a complete problem-solving sequence. (For an
explanation of the necessity of the information retained, see Neches,
1981b, section 5.2.)



From both the computational consideration of minimizing the size
of the database to be searched, and the psychological consideration of
limited short-term memory, it was essential to have some mechanisms in
the system which would cut down the number of propositions required for
consideration without eliminating any critical information.

The mechanism adopted in HPM assumed an extremely rapid decay of
working memory contents; propositions drop out of working memory
unless used within two processing cycles. The propositions in working
memory consisted of those required to specify the current goal, plus a
set brought in from long-term memory by a spreading activation process.
To reduce the number of propositions brought in from long-term memory,
activation was assumed to spread unevenly through the semantic network,
with the primary direction in which it spread being dependent on the
processing status of the current goal.
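
The rapid-decay assumption can be illustrated with a small Python
sketch. It is not HPM's implementation: the class name, the method
names, and the exact reading of "within two processing cycles" (a
proposition is dropped once two full cycles have passed since it was
last added or used) are assumptions made here.

```python
class DecayingWorkingMemory:
    """Working memory whose propositions vanish unless used recently."""

    def __init__(self, max_idle_cycles=2):
        self.max_idle = max_idle_cycles
        self.cycle = 0
        self.last_used = {}          # proposition -> cycle when last added/used

    def add(self, proposition):
        self.last_used[proposition] = self.cycle

    def use(self, proposition):
        # Using a proposition that is still present refreshes it.
        if proposition in self.last_used:
            self.last_used[proposition] = self.cycle

    def tick(self):
        """Advance one processing cycle and drop propositions idle too long."""
        self.cycle += 1
        self.last_used = {p: c for p, c in self.last_used.items()
                          if self.cycle - c < self.max_idle}

    def contents(self):
        return set(self.last_used)


wm = DecayingWorkingMemory()
wm.add("goal: add 2 and 3")
wm.add("set A has 2 objects")
wm.tick()                      # cycle 1: both still present
wm.use("goal: add 2 and 3")    # only the goal is used
wm.tick()                      # cycle 2: the unused proposition decays
after_two = wm.contents()
wm.tick()                      # cycle 3: the goal, unused since cycle 1, decays
after_three = wm.contents()
print(after_two, after_three)
```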

Specifically, when a new goal is initiated, HPM sends activation
down through the network to retrieve information most likely to be
helpful in deciding how to process the goal. When an old goal is
terminated, HPM sends activation up the hierarchy towards higher goals
and sideways towards planned successor goals, thus retrieving
information most likely to be helpful in deciding what action to take
next. Although this part of the model was developed in response to
computational overloads produced by large semantic structures, it turns
out in retrospect to provide a psychologically plausible account of the
Zeigarnik effect. In this account, the effect is an outcome of
associative retrieval processes primarily intended to minimize the size
of working memory needed for processing goal structures.

Assume that, as in HPM, a goal structure is built as a task is
carried out, in which goal nodes are represented as either active or
completed. In the case where the task is interrupted before
completion, the rapid decay process causes the loss of these nodes from
active memory; they are, however, retained in long-term memory. The
instruction to give a recall causes retrieval of some of the
higher-level nodes in the goal structure, since these are the nodes
that define the task. Because these goals are represented as active,
their return is treated as a re-initiation, and activation is sent down
the network according to the processes outlined above. This retrieves
a set of nodes which contains more detailed information about the task,
since it consists of the more specific sub-goals set up to perform the
task, along with information about the operands of those goals.


On the other hand, if the task is allowed to go through to
completion, the goal nodes are all represented as completed when they
return to long-term memory. If the same higher-level nodes are
retrieved due to a recall instruction in that case, HPM will try to
send activation up and sideways through the network. Since the goals
it works from are already near the top of the structure, there is
simply not much up to go. HPM therefore retrieves a smaller set of
propositions, which furthermore consists of more general and abstract
propositions because they are drawn from near the top of the goal
structure.

The significant point of this example is that the demands of
formalizing a model in computational terms led to new ideas about
issues not initially seen as related to modelling learning processes.
HPM, although basically a model of learning, led to development of a
notion of directed activation, a distinct variant upon current
notions of unfocused spreading activation (Collins & Loftus, 1975;
Anderson, 1976). An additional property of the simulation is that it
gives us some insight into the teleological role of activation in an
information processing system. The simulation suggests that it should
be viewed not only as a mechanism for focus of attention or information
retrieval, but also as a component of a larger mechanism for minimizing
working memory loads. In that larger mechanism, activation may serve
to enable relatively drastic measures for eliminating propositions from
active memory, by providing an assurance that critical propositions
will return when needed.
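
The directed-activation account of the Zeigarnik effect given in this
section can be sketched as follows. This is a deliberately reduced
illustration, not HPM itself: the goal structure is a plain tree, an
active goal sends activation to all of its descendants, a completed
goal sends it only to its parent and its later siblings (standing in
for "planned successor goals"), and the class and function names are
invented for the sketch.

```python
class Goal:
    def __init__(self, name, parent=None):
        self.name = name
        self.status = "active"        # "active" or "completed"
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def retrieve_from(goal):
    """Names of the nodes reached by directed activation from `goal`."""
    if goal.status == "active":
        # Treated as a (re-)initiation: activation flows down to sub-goals.
        found, frontier = [], list(goal.children)
        while frontier:
            node = frontier.pop(0)
            found.append(node.name)
            frontier.extend(node.children)
        return found
    # Completed goal: activation flows up to the parent and sideways to
    # the planned successors (here, the later siblings).
    if goal.parent is None:
        return []
    siblings = goal.parent.children
    later = siblings[siblings.index(goal) + 1:]
    return [goal.parent.name] + [s.name for s in later]


def build_task():
    top = Goal("add 2 and 3")
    Goal("count out 2 objects", top)
    Goal("count out 3 objects", top)
    Goal("count the combined set", top)
    return top


interrupted = build_task()            # goals are still marked active
completed = build_task()
completed.status = "completed"
for sub in completed.children:
    sub.status = "completed"

recall_interrupted = retrieve_from(interrupted)
recall_completed = retrieve_from(completed)
print(len(recall_interrupted), len(recall_completed))
```

In this reduced form the completed top-level goal retrieves nothing at
all, which exaggerates the "not much up to go" point; the qualitative
contrast with the interrupted case is what the sketch is meant to show.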

4.2 READER And CAPS: An Example Of Concern With Control Processes

It is worthwhile to consider another example of directed activation,
Thibadeau's READER model, which develops the notion in a much more
sophisticated way. Thibadeau (1981; Thibadeau, Just, & Carpenter,
1981) has developed a production system language called CAPS in order
to implement the READER model. CAPS is a programming architecture of
some interest, only in part because it illustrates another useful
property of simulation research: the development of general notions of
control and focus of attention.

READER's mission is to account for gaze duration data from eye
movement studies of reading. It is similar in some respects to
McClelland & Rumelhart's word perception model, but differs in
implementation and models a broader range of processes. The
similarities stem from the notion of nodes representing hypotheses with
activation levels representing confidence in the correctness of the
hypothesis, excitatory relations to other hypotheses consistent with a
given hypothesis, and inhibitory relations to others which are
inconsistent. Rather than doing parallel processing on a feature array
representing a four-letter character string, as McClelland and
Rumelhart's program did, READER sequentially processes a string of
letters and spaces representing a paragraph of text. Hypotheses in
READER are maintained at the letter-cluster, word, syntactic, and
semantic levels. The system tries to do as much as possible at all
levels before moving on to the next input element. These properties
allow the model to explain gaze durations in terms of the time required
for hypotheses to rise above the threshold for acceptance and thus
allow the system to move on.

The READER model offers explanations for a number of effects. For
example, at the word encoding level, the sequential processing of the
input string causes the system to take more time to activate longer
words, reproducing the linear increase in gaze duration found in data
from human subjects. Gaze duration also turns out to be a log function
of word frequency, a phenomenon modelled in READER as essentially
similar to McClelland & Rumelhart's "rich get richer" effect on
baseline activation levels. At the syntactic parsing level, the system
displays a number of effects similar to those observed in the human
data, most of which occur because of the way that interacting semantic
and syntactic processes contribute to activation levels of syntactic
hypotheses.

Among other things, the collaboration between semantic and
syntactic processes allows the system to parse difficult noun phrases
like "the greater the mass" (det adj det noun). It also produces the
negative correlation observed in humans between the number of modifiers
in a noun phrase and the fixation time for the head noun. The more
modifiers there are, the more semantic constraints are imposed, thus
pre-raising the activation levels for likely candidates for the noun
itself, and thereby decreasing the time required to raise the correct
alternative above the threshold for acceptance. Much the same process
underlies READER's ability to duplicate human subjects' tendency to
skip over function words entirely.

Finally, the processing structure of the READER system, which
enables it to do as much processing as possible at all levels before
moving on to the next input, allows it to reproduce several effects at
the semantic level, such as increased gaze durations at the first
mention of a topic and at the end of sentences.

Thibadeau has found himself in the enviable position for a
modeller of having an extremely rich body of data against which the
performance of his program can be evaluated (cf. Just & Carpenter,
1980). And, in fact, the program does quite reasonably; without
special tuning of parameters, Thibadeau, Just, & Carpenter (1981) claim
that READER accounts for 79% of the variance in their data, in contrast
to the 72% accounted for by the model offered by Just & Carpenter
(1980).

However, the principles embodied in the program are of even
greater interest than its account of the data, because Thibadeau has
done an especially impressive job of embedding his model of performance
at a particular task within an information processing architecture of
great potential generality. To see this, we need to look more closely
at CAPS (Thibadeau, 1981), the interpreter for the language in which
READER was implemented.

CAPS, which stands for "Collaborative Activation-based Production
System", is a LISP interpreter for a language oriented towards
concurrent processing of hypotheses at multiple levels. Its
fundamental processing units are productions, independent
condition-action rules. Its fundamental data objects are propositions,
consisting of node-relation-node triples with an associated activation
level. Activation represents the system's current confidence or
certainty that the proposition is correct. The conditions of
productions specify some set of propositions, along with threshold
activation levels for each, below which the production will not be
eligible for execution. CAPS executes all productions whose conditions
are satisfied. Once a production becomes eligible for execution, it
continues to fire on each processing cycle until some event occurs that
causes it to stop. The primary action of a production is altering the
activations of specific propositions by some proportion of the
activation of one of the production's evoking propositions.

Figure 6 illustrates this by showing the general form of CAPS
productions, and a hypothetical example paraphrased into English. The
example can be paraphrased further as saying, "If you think you're
seeing the letter T, but only if you think it's starting a new word,
and you also think that the word might be THE, then increase your
certainty that the word in fact is THE by a proportion of your
certainty that you've seen a T." Note that the conditions are specified
in such a way that the production would begin to fire when the
hypotheses first began to be entertained, and would stop firing when
the target hypothesis is either accepted (activation greater than .999)
or rejected (activation drops to zero).

Figure 6

GENERAL FORM OF CAPS PRODUCTIONS

    (p production-name
       ( propositions to send activation
         context in which to send
         conditions for starting firing
         conditions for stopping firing
       -->
       ( <spew> from sending propositions
                to target propositions
                and side-effect propositions )))

EXAMPLE (PARAPHRASED INTO ENGLISH)

    (p Letter-to-word
       ( the letter seen is "T", activation 0.2 or greater
         the letter begins a new word, activation 0.3 or greater
         the word seen is "THE", activation 0.01 or greater
         the word seen is "THE", activation 0.999 or less
       -->
       ( <spew> from the letter seen is "T"
                to the word seen is "THE" )))
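
The behaviour of the Letter-to-word example can be imitated with a
short Python sketch. This is not CAPS code (CAPS is a LISP
interpreter); it only mimics the paraphrased production. The
dictionary of activations, the starting values, the proportion of 0.5,
and the cap at 1.0 are all assumptions chosen for illustration.

```python
LETTER_T = 'the letter seen is "T"'
NEW_WORD = "the letter begins a new word"
WORD_THE = 'the word seen is "THE"'


def letter_to_word(activation, proportion=0.5):
    """Fire once if the thresholds are met; return True when the rule fired."""
    eligible = (activation[LETTER_T] >= 0.2
                and activation[NEW_WORD] >= 0.3
                and 0.01 <= activation[WORD_THE] <= 0.999)
    if eligible:
        # <spew>: raise the target by a proportion of the sender's activation.
        boosted = activation[WORD_THE] + proportion * activation[LETTER_T]
        activation[WORD_THE] = min(1.0, boosted)   # cap is an assumption
    return eligible


activation = {LETTER_T: 0.5, NEW_WORD: 0.6, WORD_THE: 0.1}
cycles_fired = 0
while letter_to_word(activation):
    cycles_fired += 1

print(cycles_fired, activation[WORD_THE])
```

With these starting values the rule fires on four successive cycles and
then stops by itself, because the target hypothesis has passed the
0.999 ceiling named in its own conditions.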


In actual CAPS productions, the proportion of activation
transmitted is specified in the production, but that proportion is
actually a multiplier for a global parameter which can be adjusted by
an action of productions called "<REWEIGHT>". This is one of a number
of actions that allow the system to modify the rate at which activation
flows from one hypothesis to another, along with thresholds for
acceptance or rejection.

In short, Thibadeau has built not just a model of reading, but a
very general processing language for implementing a large class of
models based on a common theoretical framework. His work is a very
nice example of how a concern with control processes and focus of
attention can pay off.

4.3 Sloppy Errors: An Example Of Transfer To New Domains

There are many similarities between READER and McClelland & Rumelhart's
model, and many complementary features as well. Thibadeau offers a
model of parsing processes and a general control structure. McClelland
and Rumelhart provide an analysis of interactions in the transmission
of activation under this sort of control structure, namely, the
"friends and enemies" effect, the "rich get richer" effect, and the
"gang" effect. They also offer some mechanisms for explaining how
expectations come into play: context-dependent adjustment of weights
on links between hypotheses at different levels. Thibadeau, in turn,
provides in CAPS processing mechanisms such as <REWEIGHT> that make it
possible to model those adjustment processes.

Together, they set the stage for a simulation of a seemingly very
different topic, "sloppy" errors in algebra problem solving, which I am
now working on in collaboration with James Greeno and Michael Ranney.
Greeno has collected a large body of protocols illustrating a common
and persistent problem. Novices make a large range of seemingly random
errors, which they themselves can sometimes detect as errors if asked
to review their own work. These errors occur with much greater
frequency in novices than experts. It is not that the subjects have
missing or incorrect rules for solving the problems, since they can
identify their own errors. Nor is it that they have buggy rules (Brown
& Burton, 1978), since they can identify the correct actions and since
the errors do not consistently occur.

The model we are developing to account for these observations
postulates an activation-based parsing process, like that in
Thibadeau's model, that is trying to build an internal representation
of an input algebra expression. The effects that McClelland &
Rumelhart outlined can cause the system to misrate some of its
hypotheses about the content of expressions. If one of the wrong
hypotheses is accepted before the correct hypothesis has time to gain
sufficient strength, an error will occur through the system applying
correct algebra rules to incorrect data. In our model, learning to
avoid errors has two components: learning the appropriate thresholds
for accepting hypotheses of various types, and learning the correct
weights to be used in taking one hypothesis as supporting another.
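
The race between a wrong and a correct hypothesis can be illustrated
with a toy Python sketch. It is not the model under development, whose
details are not given here: the two competing readings, their starting
activations, the per-cycle support values, and the function name are
all hypothetical, and activations are kept as whole numbers (hundredths)
to keep the arithmetic exact.

```python
def first_accepted(baseline, support, threshold, max_cycles=50):
    """Return (hypothesis, cycle) for the first hypothesis to reach threshold."""
    activation = dict(baseline)
    for cycle in range(1, max_cycles + 1):
        for hypothesis in activation:
            activation[hypothesis] += support[hypothesis]
        accepted = [h for h in activation if activation[h] >= threshold]
        if accepted:
            # If several cross on the same cycle, take the strongest.
            return max(accepted, key=lambda h: activation[h]), cycle
    return None, max_cycles


# Hypothetical competing readings of the same part of an expression.
# The misreading starts higher (a familiar pattern) but gains support slowly.
baseline = {"3x (misread)": 30, "3 + x (correct)": 5}
support = {"3x (misread)": 5, "3 + x (correct)": 15}

hasty = first_accepted(baseline, support, threshold=40)
careful = first_accepted(baseline, support, threshold=80)
print(hasty, careful)
```

With the low threshold the misreading is accepted on the second cycle;
with the higher threshold the correct reading has time to overtake it.
Adjusting the threshold corresponds to the first learning component
named above, and adjusting the support values to the second.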

What these examples illustrate is one of the most important
properties of the simulation approach: the development of general
concepts of information processing mechanisms. Regardless of the
particular topic area, all simulation systems must solve the same
problem: specification of control processes that will produce
appropriate focus of attention. That is, whatever the program is to
do, ensuring that it actually does it requires specifying mechanisms
that will select appropriate actions in the proper sequence. Since all
psychological simulations share the concern of modelling an intelligent
system, general concepts about these control mechanisms may be
developed which have applications in areas far removed from their
origin.

5.0 LANGUAGES FOR PSYCHOLOGICAL SIMULATIONS

So far, I've been talking about some simulations of interest and trying
to suggest some principles which they illustrate. At this point, I'd
like to shift gears a bit and consider the languages in which
simulations are implemented.

Although many different languages have been used to write
simulation programs for psychology, historically the three most
important are probably IPL, SNOBOL, and LISP. These are the languages
which introduced the key concepts of list processing, pattern matching,
and function notation.

It's worth quoting two sentences about IPL-5 from Sammet's (1969)
review of programming languages, because they capture some critical
points about the fate of many special-purpose languages. The first
quote reads, "The most significant property of IPL-5 is that it has a
closer notational resemblance to assembly language than any other
language in this book..." The second quote brings some other sad news,
"The implementation and development of this line of language stopped
with IPL-5 because the people most vitally concerned were more
interested in the problems they were trying to solve than in further
language development."

It is these two factors, ease of use and certainty of support,
that suggest why LISP caught on to a much greater extent than IPL. By
and large, it has been such pragmatic factors that have influenced
attempts to develop simulation languages especially for psychology. It
would be a little grandiose to count the languages just mentioned as
strictly psychological, since their development fell more within the
bounds of AI and since they have also been put to use by other
cognitive scientists (such as the MIT linguists whose work with COMIT
led to the development of SNOBOL).

5.1 The First Generation Of Psychological Simulation Languages

Therefore, the first generation of specialized languages should
probably be considered to have arrived in the early '70s with Newell's
(1973) PSG production system, Norman & Rumelhart's (1975) MEMOD
interpreter for the language SOL, and Anderson's (1976) ACT model.
These are all systems in which a number of specific simulations have
been implemented, but where the system itself was an object of
psychological interest because it was seen as an analogy to at least
some global aspects of the human information processing system. Newell
emphasized event-driven processing and working memory limitations.
Norman & Rumelhart emphasized long-term memory and the notion of
"active semantic networks". Anderson's system tries to integrate all
of these concerns. I will refer to all such systems as "whole-system"
simulations; it is important to distinguish them from
"special-purpose" programs intended to simulate performance in a
particular domain.

It is worth noting that, although their developers are still
active in simulation work, all three of the systems just named have
been phased out. Their developers seem to have turned, instead, to
special-purpose programs designed to explore restricted aspects of
verbally specified theories. Rumelhart's model of word perception was
implemented in a program that did only that (McClelland & Rumelhart,
1981). Rumelhart & Norman (19Bf) have developed a complementary model
of typing, again implemented in a special-purpose program. Anderson
has implemented some of his recent ideas about knowledge compilation as
a learning mechanism (Neves & Anderson, 1981) not in his own ACTF
program, but in a simpler production system architecture which retained
only those features of ACT deemed immediately relevant to the task at
hand.

Their new work is quite consistent with their old, so the
abandonment of the whole-system simulations cannot be taken as a
rejection of the theories. Rather, it seems more a question of
practical matters. I'd like to speculate on a number of factors that
lead researchers to abandon large systems.

- The systems become slow and expensive to run; there is a feeling
  that the cost is not justified when portions of the system are not
  directly related to the current topic of interest.

- The problems of developing and debugging a system grow as it
  increases in complexity; trained psychologists may prefer
  psychological research to hardcore computer science.

- Demand from others for chances to use the system is generally low;
  many researchers, even if they have the facilities to bring up the
  program at their own site, are hesitant to do so due to the
  theoretical unwillingness to buy an entire set of assumptions, and
  to the pragmatic fear of poor maintenance.

- At the same time, the demands of the few who are interested in
  adopting the system can become burdensome; one hesitates to commit
  the resources required for documenting and extending a system in
  order to make it usable outside the lab. (Norman and Rumelhart,
  who produced a manual for their MEMOD system running over 100
  pages, are a notable exception to this remark.)

There are a number of advantages of pre-existing languages like
LISP that make these difficulties seem especially discouraging. LISP
is available on a wide range of machines in more-or-less compatible
dialects (e.g., DEC KL-10s and 20s, VAXes, IBM 360s). With the
exception of MIT's MACLISP variant, reasonably clear documentation is
readily accessible. The language is fairly well-structured,
symbol-oriented, and has many list processing and string manipulation
constructs. It is relatively easy to define new data structures.
Last, but by no means least, most variants of LISP offer fairly useful
interactive debugging and trace mechanisms.



Thus, it may seem that the trends favor small special-purpose
simulation programs. However, to balance the picture, there are two
points to consider. First of all, there are new whole-system
simulations being developed. Thibadeau's (1981) CAPS and my own HPM
(Neches, 1981a, 1981b) are two examples of such systems. Second, the
way that CAPS and HPM were developed shows that there are some benefits
to the whole-system approach in terms of generality and understanding
of unexpected inter-relations between components of the information
processing system.

Although it may turn out that the CAPS and HPM efforts are subject
to the same pitfalls as previous whole-system simulations, there is
another system under development which attempts to steer a middle
course between the alternatives of special-purpose modelling and
whole-system simulation. That system is called PRISM, for Program for
Research Into Self-Modifying Systems, and is being developed by Pat
Langley of Carnegie-Mellon University and myself (Langley & Neches,
1981).

5.2 The PRISM Production System Architecture 

PRISM is a production system interpreter implemented by augmenting
LISP with a number of special functions. It owes a major debt to
Forgy's (1979) OPS4, from which a large portion of its code is
borrowed.

Production system programs are more difficult to follow than
traditional programs, because of their many conditional rules and the
absence of an explicitly specified order of execution for the rules.
This has probably been a major factor in limiting their acceptance.
Nevertheless, there are a number of attractive properties to production
systems, as Newell & Simon (1972, pages 804-806) and Langley, Neches,
Neves, & Anzai (1980) have pointed out. They can model both
goal-driven and data-driven processing, the program organization offers
a closer analogy to human memory limitations than other programming
formalisms, and the relative independence of individual production
rules gives programs a degree of modifiability which might facilitate
models of learning processes.

The design philosophy underlying PRISM is that there are too many
unresolved questions about the details of how a production system
should work. Thus, it is premature to fix a particular set of choices
and try to impose them upon users. Instead, PRISM seeks to identify
the key choice points in specifying a production system architecture,
offer plausible options at those points, and make it easy for
sophisticated users to implement alternatives to those options. Thus,
rather than being a whole-system simulation of a particular information
processing theory, PRISM defines a class of theories, and leaves it to
the user to specify the details.

In order to do this, PRISM expands somewhat upon the traditional
view of a production system as consisting of a data memory and a
production memory, with productions being selected and applied in a
repeating "recognize-act" cycle. Figure 7 shows the general structure
of the PRISM system. Fixed components are shown as rectangles; those
involving user-controlled options are shown as circles. Arrows
indicate information flow.

For example, PRISM divides the process of modifying memory into
three components: add-to-wm, which puts propositions into working
memory for temporary storage; add-to-net, which puts propositions into
long-term semantic memory; and add-connections, which ties
propositions to others in a way that permits activation to pass between
them. Almost all operations performed by PRISM can be specified by the
user to be either default actions (performed on all propositions
asserted as the action of a production) or special-case actions
performed only on the propositions explicitly specified as their
arguments. Thus, the user has case-by-case control over how these
operations are applied.

Once a proposition enters working memory, it becomes subject to
policies selected by the user for determining how long it will reside
there. Among other things, users select a decay function to be used in
computing how activation will decrease over time, along with a
threshold below which propositions will be treated as inactive.
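
The idea of a user-selected decay function paired with an inactivity
threshold can be sketched in Python as follows. PRISM itself is built
on LISP and its actual option names are not given here, so the function
names, the two example decay functions, and the numeric values are
assumptions made for illustration.

```python
def halving_decay(activation):
    """One possible decay function: lose half the activation each cycle."""
    return activation * 0.5


def linear_decay(activation, step=0.2):
    """Another possibility: lose a fixed amount each cycle, never below zero."""
    return max(0.0, activation - step)


def age_working_memory(wm, decay, threshold):
    """Apply one cycle of decay; keep only propositions still at or above threshold."""
    aged = {prop: decay(act) for prop, act in wm.items()}
    return {prop: act for prop, act in aged.items() if act >= threshold}


wm = {"goal: add 2 and 3": 1.0, "set A has 2 objects": 0.3}
after_halving = age_working_memory(wm, halving_decay, threshold=0.2)
after_linear = age_working_memory(wm, linear_decay, threshold=0.2)
print(after_halving, after_linear)
```

Because the decay policy is just a function argument, a different
theory of forgetting can be tried without touching the rest of the
cycle, which is the kind of choice point the text describes.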

As Figure 7 shows, data can enter active memory from several
directions. In addition to explicit assertions of new data, old data
may return to active memory via a process of spreading activation, or
associative retrieval. We have seen several examples in this paper
illustrating why this is a useful component of a model. However, the
details in those examples differed enough for it to be clear why
options are worthwhile. PRISM offers three options.

The "Spread-to-depth" option assumes that activation is sent out
only from a subset of active nodes, and travels with decreasing
strength to all nodes within a specified distance. The
"Spread-to-limit" option also assumes that activation travels with
decreasing strength from a subset of the active nodes, but allows the
activation to travel from node to node until it drops below a threshold
level. The third option permits directed activation schemes similar to
Thibadeau's (1981). As with all PRISM options, it is relatively easy to
implement alternatives to those supplied, since all that is required is
to provide the name of a function which will be executed by PRISM on
the list of propositions from which activation is to spread.

That list of propositions is determined by choices made by the
user; as with other functions, the associative retrieval functions may
either be called as explicit actions of productions or specified as
default actions to be applied to all propositions asserted by
productions.
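
The first two spreading options can be contrasted in a short Python
sketch. This is not PRISM code: the network is a plain dictionary of
neighbour lists, activation is assumed to halve at each link, each node
is visited at most once per source, and the function names merely echo
the option names in the text.

```python
from collections import deque


def _spread(net, sources, falloff, may_pass):
    """Breadth-first spread from each source; returns activation received per node."""
    received = {}
    for source, strength in sources.items():
        seen = {source}
        queue = deque([(source, strength, 0)])
        while queue:
            node, act, depth = queue.popleft()
            passed = act * falloff            # strength decreases at every link
            if not may_pass(passed, depth + 1):
                continue
            for neighbour in net.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    received[neighbour] = received.get(neighbour, 0.0) + passed
                    queue.append((neighbour, passed, depth + 1))
    return received


def spread_to_depth(net, sources, max_depth, falloff=0.5):
    """Reach every node within `max_depth` links of a source."""
    return _spread(net, sources, falloff, lambda act, depth: depth <= max_depth)


def spread_to_limit(net, sources, threshold, falloff=0.5):
    """Keep going until the passed activation would fall below `threshold`."""
    return _spread(net, sources, falloff, lambda act, depth: act >= threshold)


# Hypothetical four-node chain: a -> b -> c -> d
net = {"a": ["b"], "b": ["c"], "c": ["d"]}
by_depth = spread_to_depth(net, {"a": 1.0}, max_depth=2)
by_limit = spread_to_limit(net, {"a": 1.0}, threshold=0.1)
print(by_depth, by_limit)
```

A user-supplied alternative would only need to accept the same
arguments and return the same kind of dictionary, in the spirit of
naming a replacement function.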

PRISM can operate with a wide range of policies for selecting
productions for execution, a process also known as "conflict
resolution". This turns out to be one of the key points of difference
between various production systems offered in the past. Anderson's
(1976) ACT, for example, fired some productions in parallel, but not
all of those eligible for execution. The complex restrictions imposed
by the system involved assumptions about varying lengths of time
required to select different productions, about generalized and
specialized variants of productions, and so forth. Allen Newell (1980)
offered a model of the human information processing system designed to
account for some effects in speech perception, in which he claimed that
all satisfied productions containing constants could fire on a given
cycle, but only one production involving variables in its conditions.
Thibadeau's (1981) CAPS, on the other hand, allows all matched
productions to fire. My own HPM (Neches, 1981a) divides productions
into seven classes, with different rules for each class, and fires the
union of the set of selections from each class.

     PRISM's scheme for selecting productions for execution is shown in
Figure 8. Like HPM, PRISM allows users to divide their set of
production rules into independent classes which fire in parallel. In
PRISM, users can specify any number of such classes, from one upward,
although the default is that all productions are placed in one common
class. For each class that users allow, they define a "filter", or set
of tests which must be passed for a production to be allowed to fire.
Those productions passing the first test are sent on to the second, and
so on. This allows the user to specify a wide range of conflict
resolution policies.
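
     The class-and-filter scheme can be sketched roughly as follows.
This is an illustration in Python under stated assumptions, not PRISM's
implementation: the dictionary representation of a production, the
`strength` field, and the two example tests are hypothetical. Each
class has an ordered list of tests; the survivors of one test are
passed to the next, and the productions that fire are the union of the
survivors from every class.

```python
def at_least(minimum):
    """Build a hypothetical test keeping productions whose strength
    is at or above `minimum`."""
    def test(candidates):
        return [p for p in candidates if p["strength"] >= minimum]
    return test


def strongest_only(candidates):
    """Hypothetical test keeping only the single strongest production."""
    if not candidates:
        return []
    return [max(candidates, key=lambda p: p["strength"])]


def resolve_conflicts(matched, classes):
    """Apply each class's filter (an ordered list of tests) to the matched
    productions in that class and return the union of the survivors."""
    to_fire = []
    for class_name, tests in classes.items():
        candidates = [p for p in matched if p["class"] == class_name]
        for test in tests:
            candidates = test(candidates)  # survivors move on to the next test
        to_fire.extend(candidates)
    return to_fire


# Illustrative data: two classes with different policies.
matched = [
    {"name": "p1", "class": "goal", "strength": 0.9},
    {"name": "p2", "class": "goal", "strength": 0.4},
    {"name": "p3", "class": "perceptual", "strength": 0.2},
    {"name": "p4", "class": "perceptual", "strength": 0.7},
]
classes = {
    "goal": [at_least(0.3), strongest_only],  # one winner per cycle
    "perceptual": [],                         # no tests: all matches fire
}
fired = resolve_conflicts(matched, classes)
```

     In this sketch an empty filter reproduces a policy in which every
matched production fires, while a filter ending in `strongest_only`
gives a single winner per class; mixing the two in one system is the
kind of combination the scheme is intended to allow.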


     PRISM also has a number of options related to modelling learning
processes. In a production system, learning is mainly simulated by
building new productions or by modifying pre-existing ones. (It is
also possible to model learning in terms of changing or adding new
declarative structures to long-term memory, of course, but there is no
need to offer any special options in order for that to be done in
PRISM.)

     Note that the ability to model learning easily has long been a
promise for production systems, ever since Newell & Simon (1972)
started arguing for production systems as a formalism capturing key
properties of the human information processing system. The argument
has essentially been that learning models would be easier to implement
than in traditional programming formalisms because of the modular
properties of condition-action rules, with each production specifying
the range of situations in which it is applicable, independent of all
other productions (2). Up until quite recently, this promise was
little more than just a promise. In the last few years, though,
several different simulations have been developed in the formalism of
self-modifying production systems (e.g., Anzai & Simon, 1979;
Anderson & Kline, 1979; Anderson, Kline, & Beasley, 1979; Langley,
Neches, 1981ab; Neves, 1978; Neves & Anderson, 1981). The
models which have been offered have incorporated several different
features, and PRISM offers options related to each:

- Trace data: several learning models (e.g., Anzai & Simon, 1979;
  Langley, Neches, Neves, & Anzai, 1980; Neches, 1981ab) depend
  heavily on a system's memory for past actions. PRISM offers
  options that allow users to determine the form and content of the
  memory representation that is built after each production
  execution.

- Designation: since Waterman (1975), building new productions has
  been a stable feature of production system models of learning.
  PRISM contains a number of options governing the form of new
  productions constructed by pre-existing productions.

- Strengthening and weakening: PRISM offers options governing means
  for altering the likelihood of a particular production being
  selected for firing.

- Generalization: there are also options governing mechanisms for
  expanding a production's range of applicability through
  substitution of variables for constants in the production's
  conditions.

- Discrimination: there is a parallel set of options governing
  mechanisms for restricting a production's range of applicability
  through the insertion of additional conditions.

Footnote 2: This assumption puts a heavy burden on processes for
selecting appropriate productions for firing, one reason why PRISM is
designed with such a generalized view of conflict resolution.
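
     The last two mechanisms in the list above are simple enough to
sketch. The following Python fragment is illustrative only and does not
reproduce PRISM's options: representing a production's conditions as a
list of tuples, and writing variables as strings beginning with "?",
are assumptions made for the example. Generalization substitutes a
variable for a constant; discrimination appends a further condition.

```python
def generalize(conditions, constant, variable):
    """Widen applicability: replace every occurrence of `constant`
    in the conditions with `variable`. Returns a new list."""
    return [
        tuple(variable if term == constant else term for term in condition)
        for condition in conditions
    ]


def discriminate(conditions, extra_condition):
    """Narrow applicability: require one additional condition.
    Returns a new list and leaves the original untouched."""
    return conditions + [extra_condition]


# Hypothetical conditions of a production about adding the digit 3.
conditions = [("goal", "add", "3"), ("digit", "3")]

# The generalized variant applies to any digit bound to ?n.
general = generalize(conditions, "3", "?n")

# The discriminated variant applies only when no carry is pending.
specific = discriminate(conditions, ("carry", "none"))
```

     Both functions return new condition lists rather than altering
the original, so the pre-existing production remains available
alongside its variants.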



     In summary, simulation work in PRISM starts with specifying a
processing environment that controls how productions will be
interpreted. The environment also includes long-term memory, active
working memory, and processes which manage their contents, as well as
learning mechanisms. The system is built on top of LISP, and can
therefore implement any knowledge representation which can be
expressed as LISP data structures. PRISM can be thought of at two
levels: either as a kit from which whole-system simulation packages
can be assembled, or simply as a programming language which collects
features found to have been convenient in other systems for cognitive
simulations.

     There are several motivations behind the development of the PRISM
system. Production systems have been a useful simulation tool, but it
is simply too early for any consensus to have arisen about the most
useful form for a production system language to take. PRISM is
intended to let researchers pick and choose the best combination of
features for their particular purposes, without being forced to build a
complete system from scratch. As I suggested in earlier sections,
there is a strong gain from the exercise of trying to work within a
whole-system simulation. We hope that systems like PRISM, by
encouraging researchers to specify whole systems, will promote a
greater concern with the interactions between components, that is,
with the question of how the pieces of the puzzle are going to fit
together. At the same time, PRISM's system of options, and the fact
that it is built on top of a powerful programming language like LISP,
are intended to make it relatively easy to modify and extend. This
property of flexibility means, we hope, that models of particular tasks
can be implemented within whole-system simulations without being forced
into the Procrustean bed of a fixed system.


6.0 CONCLUSION

     One of the most exciting things about simulation work is that,
because of its necessary concern with control of processing and focus
of attention issues, ideas can come out of a simulation project that
are applicable in areas quite different from the domain in which the
original work was done. I've tried to illustrate that point in the
examples of simulation which I've presented. I have also tried to
touch on a number of factors which are making simulation work easier
and more accessible than ever before. One factor is the development of
simulation languages, like CAPS and PRISM, which do not force their
users to accept any single theory of the human information processing
system, but provide frameworks in which models of the system, or
components of the whole system, can be developed and explored.
Another factor is the development of lower cost machines, such as
VAXes, with more powerful capabilities. A third factor is the
increasing availability on these machines of core languages such as
LISP, which facilitate direct implementation of special-purpose
simulations in addition to providing a foundation upon which simulation
languages more specific to psychology can be constructed.



     At the same time, though, I would like to avoid a presentation
from the messianic genre. As we have seen, there are a number of
advantages which have been claimed for the simulation approach that
really do not hold up in actual practice. A computer simulation does
not necessarily guarantee that a theory is more consistent or
comprehensible. Nor does a program's successful performance guarantee
that the theory is generalizable, or even that the causes for its
success are those predicted by the theory. The psychological
significance of a computer program can only be determined by close and
careful examination of each piece of work on a case-by-case basis.

     There are also some practical limitations which will limit the
spread of simulation work for some time to come. It is still
time-consuming and hard to delegate. Interesting projects often have
many of their payoffs only at the end, with fewer publishable
milestones along the way. Computer hardware and software facilities
are not always being planned with the potential for simulation work in
mind.

     These difficulties are due in part to the fact that the promise of
simulation methodology, the different levels at which it can
stimulate thought about psychological issues, is not as widely
appreciated as it could be. I have tried in this paper to illustrate
some of the ways in which simulations can aid us in thinking and
reasoning about the human mind. They provide a tool for empirically
analyzing theories to better understand their implications and
predictions. They offer a means of exploring interactions between
components of complex models. They pose a practical challenge to
operationalize theoretical constructs, which can lead to incidental
discoveries about related processes. And, finally, they engender a
concern with issues of process control that contributes to the
development of general principles with broad applications.